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Participants 


The  following  personnel  were  the  main  participants  in  this  grant: 

Principal  Investigator:  Dr.  Anthony  Giles  Warrack 

Co-PI:  Dr.  Alexandra  Kurepa 

ONR  Technical  Director:  Dr.  Rabinder  Madan 

NUWC-Newport  Points  of  Contact:  Dr.  Roy  L.  Streit,  Dr.  Marcus  L.  Graham 
Graduate  Students:  Ms.  Latoya  Silochan,  Ms.  Kashonda  Bynum,  Mr.  Rodolfo  Bernal, 
Ms.  Alisha  Williams 

Undergraduate  Students:  Ms.  Angela  Edwards,  Mr.  Bryahn  Ivery,  Mr.  Dustin  Lupton, 
Mr.  James  Pender,  Mr.  Terrell  Felder,  Ms.  Krystal  Knight 

Under  the  terms  of  the  original  proposal,  submitted  in  2003,  a  cohort  of  two  graduate 
students  (Silochan  and  Bynum)  and  four  undergraduate  sophomore  students  (Edwards, 
Lupton,  Ivery,  and  Pender)  was  selected.  Due  to  the  fact  that  some  funding  was  delayed, 
and  also  that  there  were  extra  funds  available  as  in-state  rather  than  out-of-state  students 
had  been  recruited,  a  no-cost  one  year  extension  was  applied  for,  and  granted,  enabling 
support  for  two  more  graduate  students,  Mr.  Ricardo  Bernal  and  Ms  Alisha  Williams,  and 
two  more  undergraduate  students,  Ms  Krystal  Knight  and  Mr.  Terrell  Felder.  Thus  the 
grant  has  supported  6  undergraduate  and  4  graduate  students. 

The  grant  was  undertaken  with  close  cooperation  at  The  Naval  Undersea  Warfare  Center, 
Newport,  RI  (NUWC-Newport)  where  Dr.  Warrack  had  held  ONR/ASEE  Summer 
Faculty  Research  Fellowships.  On  March  30,  2004  Dr.  Rabinder  Madan,  the  ONR 
Technical  Director  for  the  grant  with  Dr.  Roy  L.  Streit  of  NUWC,  visited  the  A&T 
campus  to  confer  with  the  Pi’s  and  the  students,  prior  to  the  two  graduate  students 
summer  internship  at  NUWC.  When  Dr.  Streit  resigned  from  NUWC-Newport  in  2005, 


3 


Dr.  Marcus  L.  Graham  became  the  point  of  contact.  Dr.  Errol  G.  Rowe  at  NUWC  also 
cooperated. 


Activities  and  Findings 

1.  Research  Summary 

The  research  has  concentrated  in  the  area  of  parameter  estimation  in  mixtures  of  normal 
probability  distributions,  in  particular  as  it  pertains  to  problems  of  multi-target  tracking. 
We  have  been  involved  in  the  implementation  of  algorithms  and  estimation  methods, 
such  as  the  E-M  Algorithm,  the  Kalman  filter,  and  smoothing  methods,  both  parametric 
and  non-parametric.  We  have  also  investigated  the  relative  merits  of  various  statistics  in 
evaluating  statistical  models,  such  as  the  Akaike  Information  Criterion  (AIC),  and  the 
Bayes  Information  Criterion  (BIC),  particularly  in  situations  where  the  number  of 
populations  (targets)  is  one  of  the  parameters  to  be  estimated.  Another  area  of  interest  has 
been  in  the  field  of  obtaining  good  initial  “guesses”  of  parameters,  where  we  have 
compared  various  clustering  techniques  and  algorithms. 

Another  problem  of  interest  has  been  that  of  maintaining  “target  identity”,  when  two 
target  trajectories  either  cross  or  pass  very  close  to  each  other.  We  attempted  to  combine 
smoothing  methods  with  the  EM  Algorithm,  and  were  interested  in  comparing  standard 
methods,  e.g.  linear  or  polynomial  regression,  with  methods  that  make  fewer  model 
assumptions,  such  as  smoothing  splines,  Loess  regression,  and  Kernel  Regression.  In  the 
later  case  we  were  interested  in  comparing  the  combination  of  different  kernel  functions 
combined  with  different  bandwidths. 
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2.  Research  Presentations 


At  all  stages  the  students  have  been  encouraged  to  present  their  work  on  a  formal  basis. 
The  following  is  a  list  of  presentations  that  have  been  made  by  both  the  faculty  and 
student  participants  in  the  grant: 

" Applying  the  Unscented  Kalman  Filter  to  Problems  in  Submarine  Tracking" 

Dr.  Giles  Warrack  and  Dr  Alexandra  Kurepa,  The  Third  National  Ronald  E.  McNair 
Symposium  on  Science  and  Technology  :  North  Carolina  Agricultural  and  Technical 
State  University,  January  28,  2004 

" Simulating  a  Target  Tracking  Problem  Using  the  Kalman  Filter "  , Latoya  Silochan, 

Math  Awareness  Mini-Conference,  North  Carolina  Agricultural  and  Technical  State 

University 

April  22,  2004 

"Searching  Algorithms  using  the  Kendall-Wei  Algorithm ",  Kashonda  Bynum,  Math 
Awareness  Mini-Conference,  North  Carolina  Agricultural  and  Technical  State 
University 
April  22,  2004 

‘ Applying  Density  Estimation  and  Nonparametric  Smoothing  Techniques  to  Tracking 
Problems”  Latoya  Silochan  and  Kashonda  Bynum,  presentation  to  NUWC-Newport 
Code  22  members,  July  23,  2004 

“ Multiple  Target  Tracking  Using  Functional  Density  Estimation”,  Kashonda  Bynum, 
Latoya  Silochan,  The  Fourth  National  Ronald  E.  McNair  Symposium  on  Science  and 
Technology  :  North  Carolina  Agricultural  and  Technical  State  University,  January  29, 
2005 
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“ Using  Parametric  and  Nonpar ametric  Smoothing  Techniques  to  Improve  Estimation 
with  the  EM  Algorithm  ”,  Angela  Edwards,  Dustin  Lupton,  Bryahn  Ivery,  and  James 
Pender.  Presentation  given  at  Chaffee  Auditorium,  NUWC-Newport,  July  22,  2005 

“ Submarine  Target  Tracking  Simulations  Using  Mathematical  Modelling ”,  Dustin 
Lupton  and  James  Pender.  Math  Awareness  Mini-Conference,  North  Carolina 
Agricultural  and  Technical  State  University,  April  24,  2006 

“Using  Tree  Based  Methods  to  Classify  Messages”,  Terrell  A.  Felder,  Math  Awareness 
Mini-Conference,  North  Carolina  Agricultural  and  Technical  State  University,  April  19, 
2007 

“ Applying  Logistic  regression  to  Message  Classification’ ’,  Krystal  A.  Knight,  Math 
Awareness  Mini-Conference,  North  Carolina  Agricultural  and  Technical  State 
University,  April  19,  2006 

3).  Library  of  R/S-Plus  Programs 

During  the  course  of  the  research  a  collection  of  computer  programs  for  the  E-M 
Algorithm,  the  Kalman  Filter,  and  the  Unscented  Kalman  Filter  were  written,  many  in 
collaboration  with  Dr.  Errol  G.  Rowe  of  NUWC-Newport.  These  are  included  in 
Appendix  A. 

Education  and  Results 


1).  Initial  Training 

All  students  in  the  initial  cohort  were  given  special  instruction  through  a  series  of 
lectures  and  seminars,  as  well  as  being  required  to  take  courses  in  Probability,  Linear 
Models,  and  Statistical  Inference.  The  lectures  and  seminars  covered  more  specialized 
topics  than  would  generally  be  included  in  standard  courses.  The  topics  included 
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Bayesian  Estimation,  Filtering  methods,  Nonparametric  Regression,  Monte-Carlo 
simulation,  Bootstrapping,  Maximum  Likelihood  Estimation  using  the  E-M  Algorithm, 
and  Kernel  Density  Estimation.  The  students  were  also  taught  to  program  in  MATLAB 
and/or  R/S-Plus. 

2).  Graduate  Student  Research  at  NUWC-Newport 

In  the  summer  of  2004  the  two  graduate  students,  Latoya  Silochan  and  Kashonda  Bynum, 
accompanied  Dr.  Warrack  on  a  10  week  internship  to  NUWC-Newport,  where  they 
worked  under  the  supervision  of  Dr.  Warrack,  and  Dr.  Roy  L.  Streit  (Code  22).  The 
students  worked  on  a  project  involving  the  application  of  functional  density  estimation 
techniques  to  multiple  target  tracking.  The  work  combined  Bayesian  updating  estimation 
over  time,  with  comparisons  of  different  kernels  (e.g.  Gaussian,  Epachnikov),  in  density 
estimation,  and  different  choices  of  bandwidth  (the  “Bias- Variance”  tradeoff).  At  the  end 
of  the  10  week  period  the  students  made  a  presentation  to  members  of  Code  22  at 
NUWC-Newport  entitled  “ Applying  Density  Estimation  and  Nonparametric  Smoothing 
Techniques  to  Tracking  Problems  The  students  attended  seminars  and  lectures  given  at 
NUWC.  They  also  sat  in  on  ILIR  sessions,  in  which  NUWC  researchers  made 
presentations  for  the  process  of  in-house  research  funding. 


3).  Undergraduate  Student  Research  at  NUWC-Newport 

In  the  summer  of  2005  the  four  undergraduate  students,  Angela  Edwards,  Dustin  Lupton, 
Bryahn  Ivery,  and  James  Pender  accompanied  Dr.  Warrack  for  a  10  week  internship  at 
NUWC-Newport.  The  NUWC  point  of  contact  was  Dr.  Marcus  L.  Graham  (Dr.  Streit 
having  left  NUWC).  Again  the  students  attended  in-house  NUWC  lectures  and 
presentations  on  non-classified  material  and  ILIR  sessions.  They  collaborated  on  a 
project  using  simulated  data  and  along  with  all  the  other  student  interns  made  a 
presentation  in  the  Chaffee  Auditorium  at  NUWC  entitled  “ Using  Parametric  and 
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Nonparametric  Smoothing  Techniques  to  Improve  Estimation  with  the  EM  Algorithm 
Using  simulated  data,  the  students  attempted  to  track  varying  numbers  of  targets  using 
the  E-M  (Expectation-Maximisation)  Algorithm,  both  when  the  numbers  of  targets  are 
known,  and  when  they  are  unknown.  One  problem  addressed  was  that  of  attempting  to 
use  various  types  of  smoothing  routines  to  maintain  target  identity  when  target 
trajectories  either  cross,  or  pass  very  close  to  each  other.  As  well  as  trying  standard 
methods,  such  as  linear  or  polynomial  regression,  they  also  experimented  with  more 
“model  free”  methods  such  as  cubic  splines,  Loess  regression,  and  kernel  regression 
(comparing  different  kernels  and  bandwidths).  An  attempt  was  made  to  tackle  the  much 
more  difficult  problem  of  tracking  when  the  number  of  targets  varies  and  is  one  of  the 
parameters  to  be  estimated.  This  problem  was  treated  by  a  “penalized  likelihood” 
approach  which  compensates  for  the  fact  that  more  complicated  models  will  have  higher 
likelihoods  than  smaller  sub-models  by  penalizing  models  with  larger  numbers  of 
parameters.  Two  standard  statistics  that  do  this  are  Akaike  Information  Criterion  (AIC), 
and  the  Bayes  Information  Criterion  (BIC).  In  so  far  as  it  was  possible  to  compare  the 
two,  empirical  evidence  based  on  simulated  data  seemed  to  indicate  that  the  BIC  was 
marginally  superior,  in  others  words  it  had  a  slightly  higher  probability  of  selecting  the 
correct  model.  It  also  has  the  attractive  feature  that  it  can  be  shown  to  be  the  posterior 
probability  for  a  model,  given  the  data. 

4).  Conclusions  Regarding  NUWC  Internships 

All  six  students  are  on  record  as  saying  that  they  believe  they  benefited  enormously  from 
the  summer  internships.  They  were  very  impressed  by  the  professionalism  of  the  staff  at 
NUWC,  and  felt  they  had  been  very  well  treated  there.  It  certainly  gave  them  exposure  to 
a  professional  working  environment.  They  also  developed  a  certain  intellectual  initiative, 
and  the  ability  to  work  on  problems  on  their  own  for  extended  periods. 
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5).  Further  Graduate  Student  Research  at  NC  A&T 


On  returning  to  North  Carolina  A&T  from  NUWC,  Latoya  Silochan  wrote  a  Master’s 
Degree  project  under  the  supervision  of  Dr.  Warrack  entitled  “An  E-M  Based  Algorithm 
for  Maintaining  Target  Identity ”,  in  which  she  combined  parametric  polynomial 
smoothing  with  the  E-M  Algorithm.  This  was  presented  in  April  2005.  Ms  Bynum  wrote 
an  MS  project  under  the  supervision  of  Dr.  Bolindrah  Borah  entitled  “ Solution  Methods 
of  Non-homo  geneous  Partial  Differential  Equations  Rodolfo  Bernal  did  a  project  under 
the  supervision  of  Dr.  Warrack,  “ Improving  Estimation  in  the  E-M  Algorithm  by  PAVA 
Smoothing  In  this  he  considered  the  estimation  of  the  parameters  of  a  normal  mixture 
distribution  when  the  mixing  probabilities,  and  the  means  of  the  mixtures  are  known  to 
have  the  same  orderings,  e.g.  pi  <  p2  <  . . .  <  pk  ,  and  pi  <  P2  <  . . .  <  pk .  He  attempted  to 
incorporate  the  techniques  of  Isotonic  Regression  using  the  “Pool  Adjacent  Violators 
Algorithm”  (PAVA)  into  the  estimation  process.  While  the  estimation  appeared  to  me 
marginally  improved,  the  resulting  algorithm  was  considerably  slower.  Ms.  Alisha 
Williams  is  currently  working  with  Dr.  Kurepa  on  applications  of  Monte  Carlo  methods 
to  Partial  Differential  Equations. 


Effect  on  Career  and  Professional  Development 

Both  Ms  Silochan  and  Ms  Bynum  graduated  with  MS  degrees  in  May  2005.  Ms  Silochan 
applied  for  a  position  at  the  Naval  Surface  Warfare  Center  (NSWCDD)  ,  Dahlgren,  VA 
She  was  offered,  and  accepted  a  position  as  a  Mathematician/Statistician,  which  she 
accepted.  However  she  was  unable  to  obtain  the  required  security  clearance,  and  so 
accepted  a  position  as  a  Mathematical  Statistician  with  the  US  Postal  Service,  where  she 
now  works.  Ms  Bynum  was  also  offered  a  position  with  NSWCDD  but  she  too  was 
unable  to  obtain  security  clearance.  She  is  working  as  a  lecturer  and  academic  counselor 
at  the  Center  for  Academic  Excellence  at  North  Carolina  A&T.  Mr  Rodolfo  Bernal 
graduated  with  an  MS  degree  in  May  2006.  He  was  offered  a  job  as  Systems  Engineer  at 
NSWC,  but  was  then  told  this  would  have  to  be  put  on  hold  because  of  a  hiring  freeze. 
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He  has  also  applied  for  a  position  with  the  US  Census  Bureau.  He  is  currently  working 
for  Your  Choice  Health  Services  in  Raleigh,  NC. 

James  Pender,  Dustin  Lupton,  Bryahn  Ivery,  and  Angela  Edwards  all  graduated  in  2006 
with  BS  (Mathematics)  degrees.  James  Pender  has  been  working  at  NSWCDD  since 
January  2007  At  the  moment  he  is  working  with  the  Aegis  Ballistic  Missile  Defense 
System  (ABMD)  in  the  Command  and  Decision  (C&D)  section.  He  has  recently  been 
selected  from  a  competitive  pool  of  candidates  for  a  position  doing  baseline  testing  for 
the  Japan  Ballistic  Missile  Defense  (JABMD)  System.  Dustin  Lupton  is  employed  by 
Progress  Energy  as  an  Auxiliary  Operator  at  the  Brunswick  Nuclear  Plant,  in  Southport 
NC.  He  is  studying  to  qualify  as  an  NRC  licensed  nuclear  reactor  operator.  Bryahn  Ivery, 
after  teaching  for  a  year  has  also  applied  to  NSWCDD,  also  the  NS  A  and  the  Census 
Bureau.  Angela  Edwards  has  had  health  problems  which  has  made  permanent 
employment  difficult. 

There  is  no  doubt  that  the  grant  significantly  impacted  the  career  choices  of  most  of  the 
students  involved,  as  well  as  giving  them  exposure  to  areas  of  applied  mathematics, 
probability  and  statistics  they  would  not  otherwise  have  encountered. 

Demographic  Data 

Of  the  four  original  undergraduates,  all  of  whom  graduated  with  GPA’s  between  3.00  and 
3.75,  three  were  African  American,  and  one  White.  Three  were  male,  and  one  female.  Of 
the  three  graduate  students  who  graduated,  two  were  African  American,  and  one 
Hispanic.  Of  the  students  currently  enrolled,  the  graduate  student  is  female  and  African 
American,  the  two  undergraduates  are  both  African  American,  one  male,  the  other 
female.  All  three  are  performing  excellently  in  their  classes. 
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APPENDIX  A 

#Kalman  Filter  R/S-Plus  program  for  2-Dimensional  Tracking 
instructions:  To  load  the  file  into  R's  workspace,  type 

#  source(7home/...  path  to  your  code  ...  /kalman.R") 

# 

#  To  run  the  program,  type 

# 

#  (a)  library(MASS)  #Just  once  per  R  session. 

#  #contains  ginv(generalized  matrix  inverse) 

# 

#  (b)  kalmanR(N)  #N  is  number  of  observations. 

# 

# 

#library(MASS)  #contains  ginv  -  generalized  matrix  inverse: 
kalmanR  <-  function(N)  { 

#  N  -  the  number  of  observations. 

# 

m  <-  numeric;  m  <-  2; 
n  <-  numeric;  n  <-  4; 
dt  <-  numeric;  dt  <-  1 ; 

accel  <-  0.5; 
obsStd  <-  .75; 

xHat  <-  numeric(n); 

xHat  <-  c(0.0,  0.0,  0.0,  0.0);  #(x_pos,  x_velocity,  y_pos,  y_velocity) 

Path  <-  matrix(0,2,N) 
path_Hat  <-  matrix(0,2,N) 

Soln  <-  numeric(n); 

Soln  <-  xHat; 

Phi  <-  diag(n);  identity  4-by-4  matrix: 

Phi[l,2]  <-  dt;  #x-component  update 
Phi[3,4]  <-  dt;  #y-component  update 

P  <-  morm(n*n); 
dim(P)  <-  c(n,n); 

Q  <-  diag(n); 

Q[l,l]  <-  dtA4  /  4;  Q[1 ,2]  <-  dtA3  /  2;  Q[2,l]  <-  dtA3  /  2;  Q[2,2]  <-  dtA2; 
Q[3,3]  <-  dtA4  /  4;  Q[3,4]  <-  dtA3  /  2;  Q[4,3]  <-  dtA3  /  2;  Q[4,4]  <-  dtA2; 
Q  <-  accel A2  *  Q; 

P  <-  Q; 


ll 


M  <-  matrix(0,m,n); 

M[l,l]  <-  1.0; 

M[2,3]  <-  1.0; 

R  <-  obsStd  *  obsStd  *  diag(m); 

Phi_P  <-  matrix(0,n,n); 

PhiPMprime  <-  matrix(0,n,m); 

B_Mprime_plusR  <-  matrix(0,m,m); 

MP  <-  matrix(0,m,n); 

for  (i  in  1  :N)  { 
epsl  <-  morm(l); 
eps2  <-  morm(l); 

processNoise  =  accel  *  c(epsl*dtA2  /  2,  epsl*dt,  eps2*dtA2  /  2,  eps2*dt); 

Soln  <-  Phi  %*%  Soln  +  processNoise; 

w  <-  obsStd  *  morm(m); 

z  <-  M  %*%  Soln  +  w;  #This  is  our  observation: 

Innovation  <-  z  -  M  %*%  xHat;  #Difference  between  approximation  &  observation: 

B  Mprime  plusR  <-  M  %*%  P  %*%  t(M)  +  R; 

w  <-  solve(B_Mprime_plusR, Innovation); 

xHat  <-  Phi  %*%  xHat  +  Phi  %*%  P  %*%  t(M)  %*%  w; 

Path[l,i]  <-  Soln[l];  Path[2,i]  <-  Soln[3] 
path_Hat[l,i]  <-  xHat[l];  path_Hat[2,i]  <-  xHat[3] 

P  <-  Phi  %*%  P  %*%  t(Phi)  -  Phi  %*%  P  %*%  t(M)  %*% 
ginv(B_Mprime_plusR)  %*%  M  %*%  P  %*%  t(Phi)  +  Q 

} 

results  <-  data.frame(Path[l,],Path[2,],path_Hat[l,],path_Hat[2,]) 
plot(results[,l],results[, 2], col-blue', type=T,xlab="Truth  =  blue,  approx  =  red",ylab=" ") 
points(results[,3],results[,4],col='red',type=T) 
results 

} 
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R/S/Plus  Program  For  Estimation  of  Parameters  in  k  Gaussian  Mixtures 

***## 

##  Program  "EMkMixEstimation.R" .  This  programe  contains  3  subroutines:  ## 
##  1).  KmixGenerate  generates  a  mixture  of  k  normals,  sample  size  n  ## 

##  2).  StartMixK  uses  sample  quantiles  to  generate  startinf  means,  sds  ## 

##  3).  emK  computes  EM  estimates  of  mu's  sd's  probs  for  any  k  ## 

***## 

##  Subroutine  to  Generate  a  sample  of  size  n  of  a  mixture  of  k  normals  ## 

KmixGenerate  <-  function(n, means, sds, probs)  { 
k  <-  length(means) 

nk  <-  rmultinom(  1  ,n, probs)  ##USE  MULTINOMIAL  TO  GENERATE 
MIXTURES 

x  <-  numeric(O) 
for  (i  in  1  :k)  { 

x  <-  c(x,morm(nk[i],means[i],sds[i])) 
print(nk[i]) 

} 

retum(x) 

} 

##  Now  Generate  Starting  Values  ## 

StartMixK  <-  function(x,k)  { 
y  <-  sort(x) 
means  1  <-  numeric(k) 
sdsl  <-  numeric(k) 

##  Uniform  prior  probs  assumed,  but  any  could  be  used  ## 

probs  1  <-  rep(l,k)/k 

breaks  <-  c(0,cumsum(probsl)) 

##  Computes  required  number  of  quantiles  ## 

quants  <-  quantile(x, breaks) 

##  This  loop  computes  the  starting  means  and  std  devs  ## 

for  (i  in  1  :k)  { 

a  <-  quants  [i] 
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b  <-  quants  [i+1] 

means  1  [i]  <-  mean(y[y  >=  a  &  y  <=  b]) 

sdsl  [i]  <-  sd(y[y  >=  a  &  y  <=  b]) 

} 

retum(list(means  1  ,sds  1  ,probs  1 )) 

} 


***## 

##  Routine  to  do  EM  estimation  ## 

##  Compute  Estimates  using  EM  with  function  emK  ## 

***## 


emK<-function(x, means, sds,probs)  { 
n  <-  length(x) 

k  <-  length(means) 

MAXiter<-500  #Maximum  number  of  iterations 
numITS  <-  0;  ERR  <-  1 

##  Create  n  by  k  matrix  for  posterior  probabilities  ## 

TX  <-matrix(0,n,k) 

while  ((ERR  >  .00005)  &  (numITS  <  MAXiter))  { 
numITS  <-  numITS  +  1 
oldmeans  <-  means 

***## 

##  Compute  column  numerators  for  TX,  n  by  k  posterior  probabilities  matrix  ## 

***## 


for  (i  in  1  :k)  { 

TX[,i]  <-  probs[i]*dnorm(x,means[i],sds[i]) 

} 

TXrowsum  <-  apply (TX,1,  sum) 

##  Now  divide  by  row  sums  ## 

TX  <-TX/TXrowsum 
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##Update  probabilities,  means,  sds## 


for  (i  in  1  :k)  { 

probs[i]  <-mean(TX[,i]) 

means  [i]  <-sum(TX[,i]  *x)/i (n*probs  [i]) 

sds[i]  <-sqrt(sum(TX[,i]*(x-means[i])A2)/(n*probs[i])) 

} 

ERR  <-  sum((oldmeans-means)A2) 

} 

print("Number  of  Iterations,  Convergence  Error");print(c(numITS,ERR)) 
retum(list(means,sds,probs))  ##  Evidently  R  not  happy  about  returning 
values  this  way  ## 

} 


***## 

##  End  of  function  emK  ## 

***## 

##  MAIN  PROGRAM  ## 

***## 


n  <- 200 

means<-c(5,10,12,15) 
sds<-c(2,4,4,2) 
probs<-c(.3,.2,.3,.2) 
k  <-  length(means) 

x  <-  KmixGenerate(n, means, sds, probs) 

#print(x) 

startvals  <-  StartMixK(x,k) 

EMestimates  <-  emK(x,startvals[[l]],startvals[[2]],startvals[[3]]) 
print("TRUE  MEANS,  SDS,  &  PROBS  WITH  EM  ESTIMATES") 
print(means) 

print(estmeans<-EMestimates[[  1  ]]) 
print(sds) 

print(estsds<-EMestimates[[2]]) 

print(probs) 

print(estprobs<-EMestimates  [  [3  ]  ] ) 
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R/S/Plus  Program  to  Track  with  Unscented  Kalman  Filter 


ukf  <-  function(N,x0=10,alpha=0.5,beta=25,gamma=8,sdl=1.732051,std2=1.0)  { 

# 

#N  The  number  of  time  samples. 

# 

# 

#To  Load  into  Memory:  source("/home/...path-to-program.../ukf.R") 

# 

#To  Run:  ukf(100) 

#Generate  state  and  observation  values:  MM[,1]  and  MM[,3],  respectively. 
#State  and  observation  noise  also  generated:  MM[,2]  and  MM[,4],  respectively. 
MM  <-  processesTruth(alpha,  beta,  gamma,  sdl,  sd2,  xO,  N) 


lx  <-  1  #Length  of  process  x 
lv  <-  1  #Length  of  noise  v 

ly  <-  1  #Length  of  observation  process 
In  <-  1  #Length  of  observation  noise  n 

X_x  <-  matrix(0,lx,2*lx+l) 

X_v  <-  matrix(0  ,lv,2  *  lx+ 1 ) 

X_n  <-  matrix(0,ln,2*lx+l) 

x  <-  MM[1,1] 
y  <-  MM[1,3] 
v  <-  MM[1,2] 
n  <-  MM[1,4] 

myPts  <-  x 

P_k  <-  (sdl*sdl)*diag(lx) 

W  <-  numeric(2*lx  +1) 

X  <-  matrix(0,lx,2  *lx+ 1 ) 

Y  <-  matrix(0,ly,2*lx+l) 

K  <-  matrix(0,lx,ly) 


W[l]  <-  .5 

for  (i  in  2:  (2*  lx  +  1))  {  #Weights 
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W[i]  <-(l  -  W[l])/(2*lx) 

} 


for  (kk  in  1  :N)  { 

# 

# 

X[,l]  <-  x 
j<-0 

for  (i  in  2:(2*lx+l))  { 

j  <- j  +  1 

X[,i]  <-  x  +  (-l)**(j+l)  *  sqrt(l/(2*W[i])*P_k) 

} 


for  (i  in  l:(2*lx  +  1))  { 

X_x[l:lx,i]  <-  X[l:lx,i] 

} 

X  x  <-  systemProcess  noNoise(X  x, lx, kk, alpha, beta, gamma) 

x_mean  <-  matrix(0,lx,l) 

dim(x  mean)  <-  c(length(x_mean),l) 

for  (i  in  l:(2*lx  +  1))  { 

x_mean  <-  x_mean  +  W[i]*X_x[l:lx,i] 

} 


P_kMean  <-  sdl*sdl  *  diag(lx) 
for  (i  in  l:(2*lx  +  1))  { 

PkMean  <-  P  kMean  +  W[i]  *  (X_x[l:lx,i]  -  x_mean)  %*%  t(X_x[l:lx,i]  - 
x_mean) 

} 


Y  <-  observationProcess  noNoise(Y,X  x, lx, ly, alpha, beta, gamma) 


y  mean  <-  matrix(0,ly,l) 
dim(y_mean)  <-  c(length(y_mean),l) 
for  (i  in  l:(2*lx  +  1))  { 

y_mean  <-  y_mean  +  W[i]*Y[l:ly,i] 

} 
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P_yy  <-  sd2*sd2  *  diag(ly) 
for  (i  in  l:(2*lx  +  1))  { 

P_yy  <-  P_yy  +  W[i]  *  (Y[l:ly,i]  -  y_mean)  %*%  t(Y[l:ly,i]  -  y_mean) 

} 


P_xy  <-  matrix(0,lx,ly) 
for  (i  in  l:(2*lx  +  1))  { 

P_xy  <-  P_xy  +  W[i]  *  (X_x[l:lx,i]  -  x_mean)  %*%  t(Y[l:ly,i]  -  y_mean) 

} 


K  <-  P_xy [  1,1]/  P_yy[  1,1]  #P_xy  %*%  ginv(P_yy) 
x  <-  x_mean  +  K  %*%  (MM[kk,3]  -  y_mean) 

P_k  <-  P  kMean  -  K  %*%  P_yy  %*%  t(K) 

myPts  <-  c(myPts,x) 

} 

errMat  <-  numeric(N) 

MSE  <-  0 
for  (k  in  1  :N)  { 

errMat[k]  <-  abs(MM[k,l]  -myPts[k]) 

MSE  <-  MSE  +  errMat[k]  *errMat[k] 

} 

MSE  <-  MSE/N 
print(MSE) 

timeStamps  <-  l:length(myPts) 

plot(timeStamps,  myPts, ylab="Average  Particle  Pos.",  type=T) 

} 


processesTruth  <-  function(alpha,  beta,  gamma,  sdl,  sd2,  xO,  N)  { 

# 

# 

# 

#  M[,l] ...  the  state  values: 

#  M[,2] ...  the  state  noise: 

# 

#  M[,3] ...  observation  values: 

#  M[,4] ...  observation  noise: 

# 

# 

M  <-  matrix(0,N,4) 

M[,2]  <-  morm(N,0,sdl)  #Process  Noise 
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} 


M[,4]  <-  morm(N,0,sd2)  Observation  Noise 

# 


M[l,l]  <-  xO 
for  (n  in  2:N)  { 

M[n,l]  <-  alpha  *  M[n-l,l]  + 

beta  *  M[n-l,l]/(l  +  M[n-l,l]*M[n-l,l])  + 
gamma  *  cos(1.2*n)  +  M[n,2]  #Process 
M[n,3]  <-  M[n,  1  ]  *M[n,  1  ]/20.0  +  M[n,4]  Observation 

} 

timeStamps  <-  1  :N 
#split.screen(c(2,l)) 
layout(matrix(c(l,2,3),  3,  1)) 

#screen(l) 

#erase.screen() 

plot(timeStamps,  M[,l],ylab="True  Particle  Position" ,type=T) 
return(M) 


systemProcess_noNoise  <-  function(X_x, lx, kk, alpha, beta, gamma)  { 

# 

# 

for  (n  in  1  :(2*lx  +  1))  {  #Process 

X_x[l:lx,n]  <-  alpha  *  X_x[l:lx,n]  + 

beta  *  X_x[l:lx,n]/(1  +  X_x[l:lx,n]*X_x[l:lx,n])  + 
gamma  *  cos(1.2*kk) 

} 

retum(X_x) 


observationProcess  noNoise  <-  function(Y,X_x, lx, ly, alpha, beta, gamma)  { 

# 

# 

for  (n  in  1  :(2*lx  +  1))  {  Observation 

Y[l:ly,n]  <-  X_x[l:lx,n]*X_x[l:lx,n]/20.0 

} 

retum(Y) 


END  OF  CODE 
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APPENDIX  B 


Graduate  Students  Presentation  at  NUWC 


In  the  summer  of  2004  the  two  graduate  students,  Ms  Latoya  Silochan  and  Ms  Kashonda 
Bynum  accompanied  Dr.  Warrack  for  a  10  week  internship  at  NUWC-Newport,  where 
they  worked  under  Dr  Roy  L.  Streit.  At  the  end  of  the  internship  the  students  made  a 
presentation  to  selected  member  of  Code  22  at  NUWC  entitled  “Applying  Density 
Estimation  Techniques  to  Tracking  Problems”.  Frequency  Azimuth  (“FRAZ”)  data  was 
simulated,  and  various  different  kernel  density  estimators  were  used.  The  effect  of 
different  bandwidths  was  also  considered. 


Applying  Density  Estimation  and  Nonparametric 
Smoothing  Techniques  to  Tracking  Problems 

A.G.  Warrack.  K.  Bynum,  and  L,  Silochan 
July  23,  2004 


1  Introduction 

We  consider  tracking  problems  in  which  multiple  targets  are  observed  over  time. 
At  each  time  period  beam  measurements  are  observed,  being  the  sum  of  mea¬ 
surements  over  frequency  bins  (FRAZ  or  Frequency  Azimuth  data).  Data  was 
simulated  for  two  scenarios.  In  the  first  the  number  of  targets  is  fixed  at  three1 
and  the  trajectories  cross  at  the  halfway  point.  In  the  second  there  are  initially 
three  targets,  with  a  fourth  entering  at  the  halfway  point.  It  is  assumed  that 
the  number  of  targets  is  unknown,  and  that  this  is  a  parameter  to  be  estimated, 
along  with  the  target  trajectories. 

It  is  assumed  that  the  data  generated  by  k  targets  at  time  £  follows  a  mixture 
of  k  Gaussian  distributions 


k 

i=  1 


i(^)a 


(i) 


If  the  number  of  mixtures;  k  is  given,  a  well  known  algorithm  for  estimating 
the  and  pi  is  the  E-M  (Expectation-Maximization)  Algorithm,  In  this 

paper  we  attempt  to  see  if  k  and  the  /!*(£)  can  be  estimated  by  smoothing  the 
data  using  a  kernel  density  estimate  at  each  time  point;  and  then  smoothing  the 
modal  points  by  means  of  nonparametric  smoothing  fe.g.  splines)  to  estimate 
the  target  trajectories.  We  also  attempt  to  test  hypotheses  about  the  number 
of  targets  at  any  times  point  by  using  a  bootstrap  test  based  on  the  minimum 
bandwidth. 


2  Density  Estimation 

There  are  two  main  components  in  the  estimation  of  probability  sdensity  func¬ 
tions,  the  selection  of  a  kernel  function,  and  the  selection  of  a  bandwidth. 
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2.1  Kernel  Functions 

A  kernel  density  estimate  for  a  probability  density  function  /(x)  is  a  function 
of  the  form 


i  ^ 

£ 

CWI 

-IS 

II 

(2) 

for  a  sample  A  i,  A3, ...,  An,  a  fixed  kernel  function  K{),  and  a  bandwidth  ft.  The 
kernel  is  normally  chosen  to  be  a  probability  density  function  with  f  xK(x)dx  = 
0  and  /  Ea  A(x)dx  =  <  00.  Two  common  choices  for  K()  are  the  Gaussian 

kernel 

A'(x)  = 

and  the  Epanechnikov  kernel 

A(jf)  =  ^  (l  —  xs/5)  /v^&,  M  < 

K (x)  =  0,  |x|  > 

(4) 

When  using  the  Gaussian  kernel  one  may  think  of  the  estimator  in  the  words  of 
Efron  and  Tibshirani  as  "adding  up  n  little  Gaussian  density  curves  centered  at 
each  point  e*  each  having  standard  deviation  ft".  It  is  generally  acknowledged 
that  while  the  selection  of  a  kernel  function  is  not  important  in  estimating  a 
density,  the  selection  of  the  bandwidth,  ft.,  is  more  crucial  If  ft  is  chosen  too 
small,  /(x)  will  have  low  bias,  that  is  E[/(x)J  will  be  close  to  /(x),  but  high 
variance,  but  if  ft  is  chosen  too  large  this  situation  will  be  reversed.  Ideally  ft 
should  be  chosen  to  minimize  the  mean  sqare  error  (MSE) ,  which  is  the  sum  of 
the  variance  and  the  square  of  the  bias.  Two  other  noteworth  facts  regarding 
the  bandwidth  are: 

1) .  the  number  of  modes,  m,  of  /(x)  is  a  monotonically  decreasing  function  of 
ft 

2) .  If  hm  is  the  smallest  bandwidth  producing  a  density  estimator  with  m  modes, 
then  the  log- likelihood  £"=1  /(A,;  AT,  ftm)  is  maximized  over  all  ft  producing 
density  estimates  with  m  modes 

Another  way  of  looking  at  ftm  is  to  think  of  it  as  the  bandwidth  that  procures 
m  modes  with  the  least  amount  of  smoothing, 

2.2  Bandwidth  Selection 

One  way  to  automatically  compute  the  bandwidth  is  based  on  computing  the 
mean  integrated  square  error  (MISE)  here  the  expectation  is  taken  with  regard 
to  ft) 


MSSE  =  E  j | /(x;  ft)  -  f{x)  2dx  =  J  E|/(x;  ft)  -  /(e)  2dx 
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and  choosing  the  smallest  value  as  a  function  of  h.  We  can  expand  MISE  as 


VISE- Bf  w*- nHw+ftW* 


The  third  term  is  a  constant  and  can  he  dropped 

One  approach  is  to  make  an  asymptotic  expansion  of  the  MISE  of  the  form 


MISE  = 


+  0(l/rah  +  hd) 


where  A'2  is  the  convolution  of  the  kernel  function  with  itself.  If  we  neglect  the 
remainder,  the  optimal  bandwidth,  the  one  minimizing  MISE ,  would  be 


^optimal 


Jk 3 


1  i/8 


B/(n{/*3*rJ 


(5) 


Of  course  the  function  on  the  right  hand  side  of  {4)  involves  the  unknown  density 
function  /.  One  solution  to  this  problem, generally  known  as  the  "plug-in" 
method,  is  to  choose  a  pilot  bandwidth,  fr,  and  approximate  /"  with 


(«) 


Of  course  this  merely  replaces  one  problem  with  another,  and  estimates  of  /" 
can  vary  greatly  according  to  the  choice  of  k.  She  at  her  and  Jones  come  up 
with  a  commonly  use  solution  solution  assuming  that  /j(h)  =  CTi3'7,  where  C 
depends  on  the  data,  but  not  on  h. 


3  Testing  for  the  Number  of  Modes 

Suppose  we  wish  to  test  for  number  of  modes,  m,  of  a  probability  distribution. 
Specifically  consider  the  hypotheses: 

Ho  ■  rtmodeif)  =  m  vs  Ha  :  nmode(/)  >  m  (7) 

Where  nn;ode(/)  is  the  number  of  mixtures,  or  modes  in  the  density, We  compute 
a  nonpar  arnetr ic  kernel  density  estimate  based  on  the  observed  data  X  i ,  X2 , ... ,  Xn 

It  may  be  shown  that  the  number  of  modes  of  fh^hiv)  decreases  monotomcally 
as  h  increases.  Let  be  the  minimal  bandwidth  for  which  }K,Hm  (^0  has  m 
modes.  That  is 

JWd e(fK,Hm(x))  -  m  and  nmode(fK,h(x))  >mih>  Hm  (8) 


3 


23 


Since  a  density  with  a  high  number  of  modes  needs  a  higher  degree  of  smoothing 
(i.e.  using  a  larger  bandwidth)  to  produce  a  density  estimate  with  a  smaller 
number  of  modes ,  we  may  use  the  minimal  bandwidth  as  a  test  statistic.  Thus 
if  were  the  observed  minimal  bandwidth  for  m  modes,  we  could  reject  the 
null  hypothesis  at  significance  level  a  if  the  p-value  is  smaller  than  a,  that  is 


>  tom)  ^ 


A  high  minimal  bandwidth  km  would  indicated  that  the  data  has  to  undergo  a 
large  degree  of  smoothing  to  produce  a  kernel  density  estimate  with  m  modes 
The  sampling  distribution  of  H,n  is  unknown  However  we  may  approximate 
the  p-value  by  using  the  parametric  bootstrap  in  which  we  repeat dly  resample 
(with  replacement)  from  the  data,  and  approximate  the  p-value  as  follows  by 
the  proportion  of  times  the  resampled  data  produces  a  minimal  bandwidth  for 
m  modes,  which  is  larger  than  the  one  we  observed.  The  steps  of  the  algorithm 
may  be  stated  as  follows: 

1.  Draw  B  bootstrap  samples  of  size  n  from  /(.; 


2.  For  each  sample  compute  the  smallest  bandwidth  that  produces  m 
modes,  obtaining  h*m{  1),  h^(2), h^(B) 


2. 


Approximate  p-value  with  5Z™=1 1 


/B 


The  first  step  may  be  done  as  follows:  Let  be  a  sample  taken  with 

replacement  from  the  data  as  1,3:2, ...,  xn.  Now  set 


=  y*  +  fi  +  -y  +  1  =  1,2 . n 


(9) 


where  y*  is  the  mean  of  the  y*,  a2  is  the  sample  estimate  of  the  variance  of 
the  x*,  and  the  are  standard  normal  random  variables.  It  was  hoped  that 
this  test  might  serve  to  estimate  the  number  of  targets,  however  it  proved  to 
be  computationally  incredibly  slow  on  the  computers  available,  and  therefore  of 
little  practical  use.  Computer  programs  were  WTitten  to  compute  the  minimum 
bandwidth  hm  givind  a  smoothed  estimate  of  m  modes  at  each  time  point  (see 
tables  1  and  2). 


4  Bayesian  Estimation  of  Beam  probabilities 

At  eact  time  point,  t,  and  at  each  beam,  &,  we  wish  to  estimate  a  posterior  sig¬ 
nal  probability  0*  *,  based  on  the  observed  measurement  X and  the  previous 
estimate  We  use  the  relationship  between  the  Dirichlet  and  Multinomial 

distributions:  The  Dirichlet  density  is  a  fc-dimensional  version  of  the  beta  den¬ 
sity.  If  x  =  (xj.,  xj, ...,  Jr*),  a  —  (ofi,Of2“'>  **)  and  A  =  a*,  the  Dirichlet 

density  is 

=  ^^-yX?1_1X?a_1...x“,f_11Xi  >  0,QT(  >  =  1 
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where  I  (t)  is  Euler’s  gamma  function 

ipDO 

r(f)  =  /  f  >  0 


The  means,  variances,  and  covariances  of  the  are  EfJY*)  =  ajAt  Var{X s)  = 

— ~TT‘-  aiKl  Coo{XtXj)  =  — awa**  i •>  ^  we^  known  that  the  Dirich- 
A  (A  "hi)  A  (A  +  1) 

let  distribution  is  a  conjugate  prior  for  the  multinomial  distribution.  Let  Y  = 

(Yi ,  Yz, ... ,  Y*)  have  multinomial  distribution  with  parameters  n,  ,  02...0fc,£*=1  Y<  = 

n  and  £*=1  9i  =  1.  If  9  =  (0i,03...,0fc)  has  prior  Dirichlet  distribution  with  pa¬ 
rameters  q  i ,  t>2 . . . ,  ,  then  the  posterior  distribution  of  9\Y  =  y,  y  =  (yiiy2i 
is  Dirichlet  with  parameters  yi  +  +  02...,Ste  -h  9k  Dirichlet  random  vector 

may  be  simulated  in  the  following  way: 

To  generate  random  vector  9  =  (#i ,  02 having  a  Dirichlet  distribution  with 
parameter  a  =  (ori,aa..,,Qfc): 

1).  Generate  Jtj.,  Jfa, A*  independent  from  gamma  desities 


/(*)  = 


e-sos“<-1 

rf«i) 


1  <  i  <  k 


2).  Set  9i 


Xi 

Xi.  +  X2  +  :.+Xk 


1  <i<k 


5  Estimation  of  Target  Trajectories 

At  time  t,  t  =  1,  2,...,T,  we  observe  histogram  count  Xt^  at  beam  b,  b  = 
1, 2, B,  .  The  counts  are  generated  from  a  mixture  of  k  targets,  according  to 
Gaussian  distributions. 


/(a)  =  ^Pi' 


£=1 


■'2 


l  (  *-Pi\ ' 

5  2  n  J 


(10) 


Let  n  be  the  total  number  of  observations  also  the  sum  of  the  beam  counts 
at  time  t,  ££=i  A-*,*  =  n. 


1).  Let  pt-i,i,Pt-i,2i  be  estimates  of  the  histogram  probabilities 

at  time  t  -  1,  £  f=1  p*_u>  =  1 


2).  Using  the  relationship  between  the  Dirichlet  distribution  and  the  multi¬ 
nomial  distribution  use  the  histogram  data  at  time  f  to  and  the  probabilities 
pt—ij  obtain  posterior  probabilities 


P*<t>  = 


+  Af>& 

1 


(11) 
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3) .  Generate  '’posterior1’  histogram  counts  according  to  a  Multinomialfn,  ) 

distribution 

4) .  Smooth  these  counts  using  a  Gaussian  kernel  density  estimator,  and  use  the 
smoothed  data  to: 

a) .  Obtain  updated  beam  probabilities  pt,uPt,2i  —  iPt.B 

b) .  Obtain  estimates  of  the  number  of  targets  and  the  position  of  the  targets 
at  time  t  from  the  number  and  position  of  the  modes  of  the  smoothed  distribu¬ 
tion 

5) .  Use  a  spline  to  smooth  the  point  estimates  in  cases  where  a  track  can  be 
discriminated 


6  Data  Simulation 

Two  scenarios  were  simulated.  In  one  in  which  three  target  start  out  in  from 
separate  positions,  and  then  cross  over.  In  the  second  there  are  initially'  three 
targets,  which  are  joined  by  a  fourth.  There  is  no  crossing  of  paths.  The  results 
of  simulations  are  shown  in  figures  1  through  5. 
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TABLE  I 


Tracking  3  targets  aver  &0  time  points  fbwminS  output)  BW-SJ  is  adaptive 
bandwidth,  modes  is  estimated  number  of  targets  at  time  t,  h(i)  is  minimum 
bandwidth  producing  i  modes  at  time  t 


time 

BW-SJ 

moles  h(l) 

h<2) 

h(3) 

h{4) 

h{5) 

[1,] 

0. 

.616 

3 

5 

.000 

2 

.607 

0. 

.397 

0 

.330 

0. 

301 

[2t] 

0. 

.637 

3 

5 

.000 

2 

.406 

0. 

.445 

0 

.353 

0. 

320 

[3,] 

0. 

.646 

3 

5 

.000 

2 

.454 

0. 

.272 

0 

.263 

0. 

253 

[4,] 

0. 

.626 

3 

5 

.000 

2 

.133 

0. 

.406 

0 

.320 

0. 

282 

[5,] 

0. 

.590 

3 

5 

.000 

1 

.966 

0. 

.363 

0 

.282 

0. 

253 

[6,] 

0. 

.574 

3 

5 

.000 

1 

.832 

0. 

.339 

0 

.282 

0. 

272 

[7t] 

0. 

.569 

3 

4 

.636 

1 

.804 

0. 

.492 

0 

.473 

0. 

349 

[8J 

0. 

.599 

3 

5 

.000 

1 

.507 

0. 

.291 

0, 

.282 

0. 

263 

[9,] 

0. 

.560 

3 

5 

.000 

1 

.354 

0. 

.291 

0 

.272 

0. 

244 

[10  J 

0. 

.547 

3 

4 

.713 

1 

.067 

0. 

.291 

0 

.253 

0. 

244 

[11,] 

0. 

.556 

3 

4 

.761 

1 

.019 

0. 

.291 

0 

.244 

0. 

234 

[12,] 

0. 

.525 

3 

4 

.282 

0 

.579 

0. 

.416 

0 

.416 

0. 

378 

[13,] 

0. 

.529 

2 

4 

.148 

0 

.378 

0. 

.311 

0 

.263 

0. 

253 

[14,] 

0. 

.517 

2 

4 

.081 

0 

.330 

0. 

.330 

0 

.263 

0. 

263 

[15,] 

0. 

.480 

2 

3 

.909 

0 

.272 

0. 

.215 

0 

.196 

0. 

186 

[16,] 

0. 

.464 

2 

3 

.497 

0 

.282 

0. 

.263 

0, 

.244 

0. 

244 

[17,] 

0. 

.445 

3 

3 

.363 

0 

.531 

0. 

.339 

0 

.234 

0. 

234 

[18,] 

0. 

.452 

2 

3 

.124 

0 

.435 

0. 

.425 

0 

.311 

0. 

253 

[19,] 

0. 

.416 

3 

2 

.895 

0 

.521 

0. 

.311 

0, 

.244 

0. 

234 

[20,] 

0. 

.410 

2 

2 

.751 

0 

.263 

0. 

.244 

0 

.205 

0. 

196 

[21,] 

0. 

.415 

2 

2 

.502 

0 

.311 

0. 

.291 

0 

.234 

0. 

215 

[22,] 

0. 

.402 

2 

2 

.196 

0 

.215 

0. 

.196 

0 

.186 

0. 

157 

[23,] 

0. 

.430 

2 

1 

.564 

0 

.272 

0. 

.234 

0 

.224 

0. 

196 

[24,] 

0. 

.465 

2 

1 

.449 

0 

.387 

0. 

.349 

0 

.301 

0. 

263 

[25,] 

0. 

.441 

2 

0 

.885 

0 

.263 

0. 

.244 

0 

.224 

0. 

215 

[26,] 

0. 

.415 

2 

0 

.569 

0 

.311 

0. 

.253 

0 

.234 

0. 

234 

[27,] 

0. 

.394 

2 

0 

.483 

0 

.320 

0. 

.311 

0 

.244 

0. 

224 

[28,] 

0. 

.448 

1 

0 

.311 

0 

.301 

0. 

.282 

0 

.263 

0. 

215 

[29,] 

0. 

.460 

1 

0 

.368 

0 

.320 

0. 

.301 

0 

.291 

0. 

224 

[30,] 

0. 

.354 

2 

0 

.378 

0 

.320 

0. 

.253 

0 

.224 

0. 

186 

[31,] 

0. 

.346 

2 

0 

.674 

0 

.291 

0. 

.263 

0 

.244 

0. 

234 

[32,] 

0. 

.367 

3 

0 

.875 

0 

.397 

0. 

.205 

0 

.196 

0. 

186 

[33,] 

0. 

.357 

2 

1 

.354 

0 

.301 

0. 

.272 

0 

.244 

0. 

196 

[34,] 

0. 

.348 

3 

1 

.708 

0 

.397 

0. 

.301 

0 

.224 

0. 

215 

[35,] 

0. 

.374 

2 

1 

.966 

0 

.301 

0. 

.272 

0 

.253 

0. 

244 

[36,] 

0. 

.451 

2 

2 

.119 

0 

.353 

0. 

.320 

0 

.311 

0. 

311 

[37,] 

0. 

.465 

2 

2 

.292 

0 

.282 

0. 

.263 

0 

.263 

0. 

234 

[38,] 

0. 

.490 

3 

2 

.359 

0 

.789 

0. 

.492 

0 

.330 

0. 

291 

[39,] 

0. 

.574 

3 

2 

.483 

0 

.732 

0. 

.373 

0 

.301 

0. 

282 

[40,] 

0. 

.463 

3 

2 

.962 

1 

.306 

0. 

.425 

0 

.339 

0. 

301 

( 
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[41,]  0.442 
[42,]  0.468 
[43,]  0.517 
[44,]  0.527 
[45,]  0.584 
[46,]  0.578 
[47,]  0.584 
[48,]  0.643 
[49,]  0.642 
[50,]  0.664 


4 

3 

3 

3 

3 

3 

3 

3 

3 

3 


3.134  1.574  0.492 
3.182  1.804  0.311 
3.450  2.186  0.464 
3.593  2.359  0.311 
3.746  2.732  0.272 
3.813  2.981  0.311 
3.986  3.373  0.349 
4.215  3.679  0.291 
4.359  3.871  0.636 
4.483  4.139  0.272 


0. 

349 

0. 

282 

0. 

30  i 

0. 

282 

0. 

373 

0. 

301 

0. 

30  i 

0. 

263 

0. 

253 

0. 

253 

0. 

282 

0. 

282 

0. 

320 

0. 

311 

0. 

263 

0. 

244 

0. 

363 

0. 

339 

0. 

244 

0. 

196 

S 
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TABLE  2 


Tracking  3  targets  over  t=l,.„, 25,  4  targets  over  t=26,.,,,5U  tbwminb  output) 


BW-SJ  is  adaptive  bandwidth,  modes  is 
h{i)  is  minimum  bandwidth  producing  i 


time 

BV- 

-SJ 

modes  h(l) 

h<2) 

[1,] 

0. 

.440 

1 

0 

.368 

0 

.272 

[2,] 

0. 

.487 

1 

0 

.378 

0 

.349 

[3,] 

0. 

.484 

1 

0 

.349 

0 

.320 

[4,] 

0. 

.496 

1 

0 

.307 

0 

.311 

[5,] 

0. 

.509 

1 

0 

.397 

0 

.330 

[6,] 

0. 

.495 

2 

0 

.502 

0 

.425 

[7,3 

0. 

.456 

2 

0 

.693 

0 

.373 

[8,] 

0. 

.490 

2 

0 

.569 

0 

.272 

[9,] 

0. 

.552 

2 

0 

.722 

0 

.320 

[10,] 

0. 

.536 

1 

0 

.397 

0 

.397 

[11,3 

0. 

.474 

2 

0 

.550 

0 

.368 

[12,3 

0. 

.470 

2 

0 

.531 

0 

.416 

[13,3 

0. 

.501 

2 

0 

.550 

0 

.339 

[14,] 

0. 

.474 

2 

0 

.732 

0 

.445 

[15,3 

0. 

.501 

3 

0 

.952 

0 

.521 

[16,3 

0. 

.454 

2 

1 

.028 

0 

.339 

[17,3 

0. 

.501 

2 

1 

.564 

0 

.473 

[18,3 

0. 

.453 

3 

1 

.101 

0 

.540 

[19,3 

0. 

.440 

3 

1 

.536 

0 

.636 

[20,3 

0. 

.481 

3 

1 

.593 

0 

.732 

[21,3 

0. 

.455 

3 

1 

.583 

0 

.559 

[22,3 

0. 

.442 

3 

1 

.737 

0 

.875 

[23,3 

0. 

.471 

4 

1 

.899 

0 

.713 

[24,3 

0. 

.475 

3 

1 

.985 

0 

.913 

[25,3 

0. 

.461 

4 

2 

.043 

0 

.894 

[26,3 

0. 

.619 

4 

3 

.201 

2 

.359 

[27,3 

0. 

.636 

4 

3 

.478 

2 

.062 

[28,3 

0. 

.617 

4 

3 

.201 

1 

.985 

[29,3 

0. 

.646 

3 

3 

.392 

2 

.133 

[30,3 

0. 

.639 

3 

3 

.852 

2 

.024 

[31,3 

0. 

.646 

3 

3 

.641 

1 

.933 

[32,3 

0. 

.642 

3 

3 

.899 

1 

.909 

[33,3 

0. 

.626 

3 

3 

.756 

2 

.081 

[34,3 

0. 

.627 

3 

4 

.215 

2 

.167 

[35,3 

0. 

.667 

3 

4 

.148 

2 

.043 

[36,3 

0. 

.638 

3 

4 

.493 

2 

.043 

[37,3 

0. 

.660 

3 

4 

.292 

2 

.153 

[38,3 

0. 

.652 

3 

4 

.531 

2 

.043 

[39,3 

0. 

.640 

3 

4 

.694 

1 

.985 

[40,3 

0. 

.649 

3 

4 

.789 

2 

.153 

9 


estimated  number  of  targets  at  time  t, 
modes  at  time  t 

h{3)  h(4>  h{5) 

0.234  0.224  0.186 
0.272  0.234  0.224 
0.263  0.263  0.224 
0.291  0.263  0.234 
0.330  0.263  0.224 
0.272  0.244  0.234 
0.349  0.272  0.263 
0.272  0.244  0.244 
0.320  0.291  0.244 
0.320  0.282  0.234 
0.358  0.272  0.224 
0.301  0.263  0.253 
0.253  0.253  0.215 
0.406  0.311  0.282 
0.320  0.291  0.253 
0.301  0.224  0.224 
0.320  0.282  0.234 
0.358  0.282  0.282 
0.301  0.282  0.253 
0.330  0.311  0.301 
0.330  0.301  0.244 
0.349  0.301  0.224 
0.579  0.339  0.339 
0.425  0.358  0.282 
0.531  0.368  0.301 
1.028  0.358  0.244 
0.904  0.358  0.291 
0.636  0.425  0.406 
0.550  0.349  0.339 
0.607  0.311  0.311 
0.464  0.330  0.311 
0.454  0.406  0.282 
0.550  0.387  0.244 
0.358  0.330  0.272 
0.397  0.349  0.320 
0.349  0.282  0.272 
0.291  0.291  0.282 
0.378  0.311  0.282 
0.358  0.339  0.301 
0.263  0.263  0.244 


30 


[41,] 

0.689 

3 

5.000 

[42  J 

0.693 

4 

5.000 

[43  J 

0.708 

3 

5.000 

[44 , ) 

0.735 

3 

5.000 

[45  J 

0.735 

3 

5.000 

[46,] 

0.769 

3 

5.000 

[47,] 

0.772 

3 

5.000 

[48,] 

0.768 

3 

5.000 

[49,] 

0.796 

3 

5.000 

[50,] 

0.833 

3 

5.000 

2.158  0. 540  0.387  0.330 
2.234  0.894  0.368  0.301 
2.339  0.358  0.320  0.234 
2.196  0.311  0.282  0.215 
2.071  0.320  0.311  0.291 
2.387  0.425  0.358  0.301 
2.493  0.291  0.272  0.253 
2.512  0.301  0.263  0.263 
2.368  0.502  0.358  0.339 
2.435  0.339  0.291  0.291 
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Figure  1.  Tine  target  positions  are  red,  blue,  green,  and  orange  lines.  Dots 
are  modal  points  of  density  estimators  at  each  time  period.  Number  of  targets 
changes  from  3  to  4. 
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Target  Ship  Pasitisns  aim  Estimates 


Figure  2,  True  target  positions  are  red,  blue  and  green  lines.  Dots  are  modal 
points  of  density  estimators  at  each  time  period.  Number  of  targets  constant 
at 
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Target  Ship  Positions  and  Bayea  Eatimaea 
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Figure  3.  True  target  positions  are  red,  blue,  green,  and  orange  lines.  Dots 
are  modal  points  of  density  estimators  at  each  time  period  using  Bayesian  up¬ 
dates.  Number  of  targets  changes  from  3  to  4. 
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Target  Ship  Positions  and  Bayea  Eatimaea 
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Figure  4.  True  target  positions  are  red.  blue  and  green  lines.  Dots  are  modal 
points  of  density  estimators  at  each  time  period  using  Bayesian  probability 
updates.  Number  of  targets  constant  at  3 
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Tjpget  Ship  P as itla rra  a nd  Spline  Estlm*»a 
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Introduction 


In  this  paper  we  consider  the  problem  of  tracking  two  targets,  and  maintaining  the 
separate  track  when  either 

•  The  two  tracks  cross, 

•  The  two  tracks  approach  each  other,  then  after  passing  very  close,  diverge. 

The  data  used  was  simulated  using  the  R  statistical  programming  language.  R  is  the  open 
source  version  of  the  S-Plus  programming  language  developed  at  Bell  Labs  by  John 
Chambers. 


Type  of  Data 

At  each  time  point,  t,  t=l,2,  we  observe  data  in  the  form  of  a  sample  of  n 
observations  xl,x2,...,xn .  These  data  are  generated  according  to  a  “mixture”  of  k normal 

distributions,  each  with  mean  and  standard  deviation  (//  .,cr  .),  j=l,2,...,k  .  Each 

observation,  xt ,  is  sampled  with  probability  Pj  from  distribution  j,  where 

P\  +  Pi  +  •••  +  Pk  =  1  -  It  can  be  shown  that  the  probability  distribution  for  each  xt  is 

k 

f(.x)  =  Y,Pj<l>(.XpPpVj) 

j= i 

Where  <j>(xi ;  p. ,  u] )  is  the  normal  density  with  mean  and  standard  deviation  ( pj ,  ov ) . 

The  means  at  time  t  represent  the  respective  positions  of  k  targets.  In  fact  we  should  write 
jUj(t) ,  representing  the  true  position  of  target  number  j  at  time  t.  In  this  presentation  we 

will  assume  the  number  of  targets,  k ,  is  known. 


Estimation  of  Target  Parameters 

The  parameters  to  be  estimated  at  each  time  point  t,  are  the  means  (positions),  standard 
deviations  and  probabilities.  Clearly  the  positions  at  time  t  should  be  incorporated  into 
the  estimation  at  time  t+1,  since  we  assume  that  all  targets  move  in  some  reasonably 
smooth  trajectory. 

A  standard  statistical  tool  in  estimating  the  parameters  in  a  mixture  distribution  is  the 
Expectation-Maximization  Algorithm  (generally  known  as  the  EM  Algorithm),  which 
seeks  parameter  estimates  that  maximize  the  likelihood  function 

n  k 

L(p,  a,p-,x)  =  Y[Yj  PAxi  >  Pj  ’  aj ) 

i= 1  J=l 

This  is  an  iterative  algorithm,  which  requires  starting  values,  and  then  updates  the 
parameter  estimates  at  each  iteration.  Under  fairly  wide  conditions  this  algorithm  can  be 
guaranteed  to  converge.  It  is  notably  useful  in  estimation  for  situations  in  which  there  is 
missing  or  censored  data,  as  well  as  for  mixture  distributions. 

In  tracking  situations  it  is  sensible  to  incorporate  the  estimate  at  time  t,  or  some  function 
of  them,  to  use  as  the  initial  estimates  at  time  t+1. 
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In  this  study  we  consider  various  smoothing  methods,  both  parametric  and 
nonparametric,  which  we  incorporate  into  the  EM  algorithm  to  improve  target  estimates, 
and  to  maintain  target  identity. 


Smoothing 

The  concept  of  smoothing  refers  to  the  accurate  fitting  of  a  smooth  curve  to  a  set 
of  “noisy”  data  (e.g.  data  full  of  error).  A  smooth  estimate  is  an  extremely  accurate 
estimate  of  the  original  data  point  because  smooth  estimates  greatly  reduce  noise,  and 
help  to  prominently  reveal  the  characteristics  of  the  actual  trajectory  being  tracked.  In 
the  research  conducted  this  summer  (May- July  2005),  several  smoothing  techniques  were 
used  to  smooth  parametric  as  well  as  nonparametric  regression  estimates. 

Parametric  Regression 

The  parametric  case  is  one  in  which  the  expected  form  of  the  function  is 
known.  In  this  instance,  it  is  most  accurate  to  perform  a  linear/multiple  regression  to 
estimate  a  specific,  finite  number  of  unknown  parameters.  The  use  of  a  weighted  sum  of 
the  observations  to  retrieve  our  fitted  values  is  pertinent  to  this  case. 


Simple  Linear  Regression  Model 

The  first  case  explored  was  that  of  the  linear  regression  model.  The  linear 
regression  model  is  given  by: 

yt  =a  +  pxi+ei 

Where  a  is  the  y-axis  intercept,  [i  is  the  slope  (otherwise  known  as  the  regression 

coefficient),  and  the  si  ’s  are  the  corresponding  error  terms  for  each  x; .  These  error 
terms  are  considered  to  be  independent  and  normally  distributed  with  mean  zero  and 
standard  deviation,  <x2 .  The  method  of  least  squares  is  used  to  estimate  the 
parameters  a,  f3,<r 2 .  For  the  trajectories  that  were  studied,  there  was  no  need  to  use  a 
linear  regression  fit.  Though  one  of  our  trajectories  has  linear  properties,  it  is  more 
efficient  and  flexible  to  use  a  quadratic  multiple  regression  analysis.  This  approach  gives 
more  flexibility  because  the  quadratic  regression  model  can  track  quadratic  functions  and 
linear  functions,  whereas  the  simple  linear  regression  is  used  strictly  for  those  functions 
that  only  display  linear  characteristics. 

However,  if  we  apply  the  linear  regression  model  to  the  specific  trajectories  used, 
we  can  see  specifically  where  this  technique  fails. 
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Fig.  1:  Linear  regression 
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In  figure  1 ,  the  track  is  plotted  as  a  function  of  time.  The  solid  line  denotes  the  user 
defined  trajectory,  and  the  hollow  dots  represent  the  estimated  data  points  that  were 
calculated  and  smoothed  using  simple  linear  regression.  It  is  obvious  that  the  program 
confuses  the  two  tracks  when  they  intersect  the  second  time.  Since  the  linear  regression 
model  is  used  for  those  trajectories  that  display  only  linear  characteristics,  the  smoothing 
procedure  will  always  estimate  a  data  point  that  has  a  linear  relationship  with  the 
previous  estimate.  For  this  reason,  when  the  tracks  cross  the  second  time,  the  linear 
regression  smoother  wants  to  continue  in  a  positive  direction  for  the  green  track  and  a 
negative  direction  for  the  red  track,  thus  causing  the  tracks  to  switch.  This  is  not  the  case 
when  a  quadratic  multiple  regression  smoother  is  used. 

Nonparametric  Regression 

The  nonparametric  case  is  one  in  which  the  expected  form  of  the  function  is 
unknown.  In  this  instance,  one  must  use  an  alternate  way  of  determining  the  weights  to 
be  used  in  the  regression.  There  are  several  different  techniques  that  may  be  used.  This 
summer  three  of  these  techniques  were  used  (all  of  which  have  predetermined  functions 
in  R). 


Kernel  Regression 

The  Kernel  regression  smoother,  most  commonly  known  as  the  Nadaraya-Watson 
Kernel  Regression  Estimate,  is  used  to  determine  the  appropriate  weights  to  use  to  yield 
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fitted  values  of  a  data  set.  The  kernel,  K,  used  this  summer  was  that  of  the  normal 
density: 


K 


Using  this  kernel,  R  will  evaluate  the  appropriate  weights  for  each  of  n  data  points, 
assigning  more  weight  to  the  estimates  close  to  actual  values  and  less  weight  to  the 
estimates  farther  away  from  actual  values.  In  R,  the  ksmooth  function  represents  the 
kernel  regression. 

To  accurately  perform  this  regression  one  must  also  specify  a  bandwidth.  The 
bandwidth  is  used  to  determine  how  fast  the  weights  will  decrease  as  the  distance  from 
the  actual  value  increases.  The  choice  of  bandwidth  is  extremely  important,  as  this  value 
will  determine  how  smooth  the  fitted  values  will  be. 

For  example,  choosing  a  bandwidth  value  that  is  too  large  (close  to  the  actual 
sample  size)  will  result  in  an  over-smoothed  fit.  This  is  because  when  the  bandwidth  is 
large  the  weights  are  determined  at  a  large  number  of  points,  thus  they  are  virtually  equal. 
The  result  is  a  set  of  smooth  points  that  have  a  seemingly  linear  relationship  as  opposed 
to  a  relationship  that  closely  resembles  the  actual  trajectory.  This  is  shown  in  Figure  2 
where  the  bandwidth  is  set  at  30. _ 

Figure  2:  Large  Bandwidth 


On  the  other  hand,  it  is  also  possible  to  choose  a  bandwidth  that  is  too  small. 
When  this  occurs,  the  predicted  point  receives  the  most  weight.  In  this  case,  each  sample 
will  yield  a  different  fit  because  there  is  too  much  dependence  on  the  individual  data  sets. 
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This  results  in  unsmooth  estimates  with  extremely  high  variances,  as  in  Figure  3  where 
the  bandwidth  is  set  at  0.5. 


O  10  20  30  40  50 
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However,  using  the  “guess  and  check”  process,  one  can  eventually  come  up  with 
an  appropriate  bandwidth  value.  For  our  purposes,  it  was  most  accurate  to  use  a 
bandwidth  of  3  for  the  first  trajectory  and  a  bandwidth  of  2  for  the  second  trajectory. 
Using  these  values  produced  the  most  accurate  fit,  as  seen  in  figure  4. 
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Figure  4:  Accurate  Bandwidth 
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Though  there  is  some  error,  it  is  obvious  that  with  the  appropriate  bandwidth  the  kernel 
regression  procedure  can  be  an  accurate  smoothing  algorithm. 

II.  Smoothing  Splines 

The  smooth  spline  procedure  is  another  function  integrated  by  R  to  smooth  data. 
This  uses  a  combination  of  the  ordinary  least  squares  estimate  and  the  loess  smoothing 
procedure  (loess  procedure  explained  in  more  detail  later).  These  smoothing  splines 
adjust  the  level  of  smoothness  by  varying  the  curve  from  a  least  squares  linear 
approximation  to  a  cubic  approximation,  and  using  whichever  approximation  fits  the 
original  data  set  most  appropriately. 

A  spline  is  a  function  that  consists  of  several  polynomial  pieces  joined  together 
with  certain  smoothness  conditions.  A  spline  is  calculated  at  several  subintervals  of  an 
interval,  I.  The  subintervals  are  determined  by  a  certain  number  of  knots,  which  we 
determine.  The  knots  are  the  points  of  the  original  function  at  which  the  function 
changes  its  character  (e.g.  the  function  changes  slope  or  changes  direction). 

For  our  purposes,  we  chose  the  number  of  knots  for  the  first  trajectory  to  be  ten 
and  we  chose  the  number  of  knots  for  the  second  trajectory  to  be  null.  We  chose  ten  for 
the  first  trajectory  because  there  are  several  points  at  which  the  trajectory  changes 
character.  Specifying  ten  knots  seemed  to  work  the  best,  separating  the  trajectory  into  ten 
subintervals  and  calculating  a  spline  at  each  interval.  Because  the  second  trajectory  is 
simply  a  quadratic  function,  it  worked  best  to  specify  the  number  of  knots  to  be  null. 

This  is  because  the  smooth  spline  function  can  make  an  accurate  determination  of  how 
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many  knots  are  needed  if  the  function  is  “simple”  (simple  meaning  strictly  linear,  strictly 
quadratic,  strictly  cubic  etc.). 

Figure  5:  Smooth  Spline 


Figure  5  shows  our  interpolation  of  the  smooth  spline  procedure.  From  figure  5 
we  see  that  the  smooth  spline  procedure  is  extremely  accurate,  with  the  blue  line 
representing  our  smooth  spline  estimate.  Compared  to  the  other  procedures  we  used,  the 
smooth  spline  estimate  proved  to  be  the  most  accurate  procedure  at  tracking  the  original 
trajectory. 

Loess  regression 

“Loess”  stands  for  “locally  weighted  scatter  plot  smoother”.  The  procedure  is 
complicated,  but  may  be  roughly  outlined  as  follows:  given  data  (xpyt),i  =  1,2, we 
wish  to  estimate  y  for  some  given  value  of  x.  This  estimate  of  y  is  obtained  by  fitting  the 
weighted  quadratic  regression  model: 
y,  =A>+ A<X  -  x) + ^(x,  -  x)2  +  £•. 

Using  the  “tricube”  function  to  determine  weights 


where  h  is  known  as  the  “span”.  The  choice  of  a  large  value  of  h  will  general  produce  a 
very  smooth  curve,  and  that  of  a  small  value  a  more  jagged  one. 

In  this  tracking  program  we  used  loess  regression. 
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In  the  figure  the  loess  regression  doesn’t  read  the  estimates  in  the  beginning  as  we  would 
like,  but  follows  well  at  the  end.  The  red  and  green  lines  are  the  true  trajectories  that  we 
are  trying  to  find.  The  dots  are  the  estimates  from  the  EM  algorithm,  and  the  black  and 
blue  lines  the  smoothed  estimate  using  the  nonparametric  loess  regression. 
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